The effects of room acoustics on MFCC speech parameter
Abstract
Automatic speech recognition systems perform well in close-talking applications, but their performance deteriorates significantly in distant-talking environments. The cause is the mismatch between training and testing conditions. To better understand the effects of room acoustics on speech features, we compare simultaneous recordings of close-talking and distant-talking utterances. We discuss the characteristics of two sources of degradation, background noise and room reverberation, which affect the spectrum differently: noise affects the valleys of the spectrum, while reverberation distorts the peaks at the pitch frequency and its multiples. For the case where very little training data is available, we attempt to choose efficient compensation approaches in the spectral, spectral subband, or cepstral domain. A vector quantization (VQ) based model is used to study how these variations influence the distribution of feature vectors. Results of speaker identification experiments are presented for both close-talking and distant-talking data.
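The VQ-based speaker identification mentioned in the abstract can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the codebook is trained with plain k-means, the distance is squared Euclidean, the codebook size is 4, and short synthetic 3-dimensional vectors stand in for MFCC frames. A constant offset added to the test vectors is used as a crude stand-in for a channel or room mismatch in the cepstral domain.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def train_codebook(vectors, size, iterations=20, seed=0):
    """Train a VQ codebook with plain k-means (size and iterations are illustrative)."""
    rng = random.Random(seed)
    codebook = [list(v) for v in rng.sample(vectors, size)]
    for _ in range(iterations):
        cells = [[] for _ in codebook]
        for v in vectors:
            nearest = min(range(size), key=lambda k: sq_dist(v, codebook[k]))
            cells[nearest].append(v)
        for k, cell in enumerate(cells):
            if cell:  # an empty cell keeps its previous centroid
                codebook[k] = [sum(col) / len(cell) for col in zip(*cell)]
    return codebook


def avg_distortion(vectors, codebook):
    """Mean distance from each vector to its nearest codeword."""
    return sum(min(sq_dist(v, c) for c in codebook) for v in vectors) / len(vectors)


def identify(vectors, codebooks):
    """Return the speaker whose codebook gives the lowest average distortion."""
    return min(codebooks, key=lambda name: avg_distortion(vectors, codebooks[name]))


def synthetic_features(center, n, spread, rng):
    """Hypothetical stand-in for MFCC frames: Gaussian cloud around `center`."""
    return [[c + rng.gauss(0.0, spread) for c in center] for _ in range(n)]


rng = random.Random(1)
train_a = synthetic_features([0.0, 0.0, 0.0], 200, 0.5, rng)
train_b = synthetic_features([3.0, 3.0, 3.0], 200, 0.5, rng)
codebooks = {
    "speaker_a": train_codebook(train_a, 4),
    "speaker_b": train_codebook(train_b, 4),
}

# Matched test data for speaker A, and the same data with a constant offset
# that imitates a training/testing mismatch (an assumption, not measured data).
matched = synthetic_features([0.0, 0.0, 0.0], 50, 0.5, rng)
shifted = [[x + 1.0 for x in v] for v in matched]

print(identify(matched, codebooks))
print(avg_distortion(matched, codebooks["speaker_a"]))
print(avg_distortion(shifted, codebooks["speaker_a"]))
```

In this toy setting the offset raises the average distortion against the correct speaker's codebook, which is the kind of shift in feature vector distribution that a VQ model makes visible. Real distant-talking data would change the distribution in more complex ways than a constant offset.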
Similar resources
Feature Level Compensation for Robust Speaker Identification in Mismatched Conditions
In this paper, robust front-end features are proposed to improve speaker identification (SI) performance by considering real-world factors such as mismatch between training and testing conditions. The most commonly used MFCC features are very sensitive to effects such as channel and environment mismatch. Characteristics of speech get changed with room acoustics, ch...
Improving the performance of MFCC for Persian robust speech recognition
The Mel frequency cepstral coefficients are the most widely used features in speech recognition, but they are very sensitive to noise. In this paper, to achieve satisfactory performance in Automatic Speech Recognition (ASR) applications, we introduce a new noise-robust set of MFCC vectors estimated through the following steps. First, spectral mean normalization is a pre-processing which applies to t...
Music Instrument Identification Using MFCC: Erhu as an Example
In the analysis of musical acoustics, we usually use the power spectrum to describe the difference between timbres from two music instruments. However, according to our experiments, the power spectrum cannot be used as effective features for erhu instrument identification. In this paper, we use MFCC (mel-scale frequency cepstral coefficients) as features for music instrument identification usin...
Speaker recognition via fusion of subglottal features and MFCCs
Motivated by the speaker-specificity and stationarity of subglottal acoustics, this paper investigates the utility of subglottal cepstral coefficients (SGCCs) for speaker identification (SID) and verification (SV). SGCCs can be computed using accelerometer recordings of subglottal acoustics, but such an approach is infeasible in real-world scenarios. To estimate SGCCs from speech signals, we ad...
Boundary Conditions for Room Acoustic Simulations
Research on room acoustic simulation focuses on more accurate modeling of wave effects in rooms. Today, also wave models (e.g., the boundary element method and the finite differences in time domain technique) can be used for higher frequencies, thus, in the geometrical acoustics (GA) domain. Simulations in architectural acoustics are powerful tools but their reliability depends on the input dat...
Publication date: 2000